ABSTRACT

Ranking  and  selection  (R&S)  techniques  are  statistical  methods  developed  to  select  the  best  system,  or 
a  subset  of  systems  from  among  a  set  of  alternative  system  designs.  R&S  via  simulation  is  particularly 
appealing  as  it  combines  modeling  flexibility  of  simulation  with  the  efficiency  of  statistical  techniques 
for  effective  decision  making.  The  overwhelming  majority  of  the  R&S  research,  however,  focuses  on 
the  expected  performance  of  competing  designs.  Alternatively,  quantiles,  which  provide  additional 
information  about  the  distribution  of  the  performance  measure  of  interest,  may  serve  as  better  risk 
measures  than  the  usual  expected  value.  In  stochastic  systems,  quantiles  indicate  the  level  of  system 
performance  that  can  be  delivered  with  a  specified  probability.  In  this  paper,  we  address  the  problem 
of  ranking  and  selection  based  on  quantiles.  In  particular,  we  formulate  the  problem  and  characterize 
the  optimal  budget  allocation  scheme  using  the  large  deviations  theory. 

1 INTRODUCTION

Ranking  and  selection  (R&S)  techniques  are  statistical  methods  developed  to  select  the  best  system, 
or  a  subset  of  systems  from  among  a  set  of  alternative  system  designs.  R&S  via  simulation  is 
particularly  appealing  as  it  combines  the  modeling  flexibility  of  simulation  with  the  efficiency  of 
statistical  techniques  for  effective  decision  making.  Furthermore,  simulation  experiments  also  allow 
for  multi-stage  sampling  as  required  by  some  R&S  methods.  Due  to  randomness  in  output  data, 
however,  comparing  a  number  of  simulated  systems  requires  care.  If  the  precision  requirement  is  high 
and  if  the  total  number  of  designs  in  a  decision  problem  is  large,  then  the  total  simulation  cost  may 
be  prohibitively  high,  limiting  the  utility  of  simulation  for  R&S  problems.  The  effective  deployment 
of  the  simulation  budget  in  R&S  is  therefore  crucial. 

The  overwhelming  majority  of  the  R&S  research  focuses  on  the  expected  performance  of  competing 
designs.  Alternatively,  quantiles,  which  provide  additional  information  about  the  distribution  of  the 
performance  measure  of  interest,  may  serve  as  better  risk  measures  than  the  usual  expected  value.  In 
stochastic  systems,  quantiles  indicate  the  level  of  system  performance  that  can  be  delivered  with  a 
specified  probability.  For  example,  in  the  financial  services  industry,  Value  at  Risk  (VaR),  a  quantile 
of  a  portfolio’s  profit  or  loss  over  a  period  of  time,  is  a  standard  tool  to  assess  the  risk  of  that  portfolio. 
Similarly,  in  the  service  industry  (e.g.,  health  care  or  telecommunications),  quantiles  are  used  as  an 
indicator  for  the  quality  of  service.  In  project  management,  stochastic  activity  networks  are  used 
to  represent  complex  projects.  In  such  an  environment,  planners  may  wish  to  compute  an  upper 
bound on the completion time of the project that would hold with high probability. Similarly, in
a  newsvendor  setting,  where  a  procurement  or  production  quantity  must  be  determined  before  the 
market  uncertainties  are  resolved,  the  optimal  quantity,  the  one  that  maximizes  expected  profit,  is 
given  by  the  quantile  driven  by  the  demand-supply  mismatch  costs.  Finally,  in  simulation  analysis  (or, 
more  generally,  in  statistics),  the  critical  values  for  test  statistics,  confidence  intervals,  and  sequential 
sampling  procedures  are  expressed  as  quantiles. 

The estimation of quantiles, however, differs considerably from that of expectations. A thorough
review of quantile estimation for independent and identically distributed (IID) data is given
by Serfling (1980). To improve quantile estimation, authors such as Hsu and Nelson (1990) and
Hesterberg and Nelson (1998) apply control variates, while Glynn (1996) uses importance sampling,
and Avramidis and Wilson (1998) deploy correlation-induction strategies to obtain variance reduction
in simulation-based quantile estimation. Closer to our work, Jin, Fu, and Xiong (2003) provide
probabilistic error bounds for simulation quantile estimators using large deviations techniques.
Hong (2009) develops an estimator based on infinitesimal perturbation analysis, while Liu and Hong
(2007) develop kernel estimators for assessing quantile sensitivities. Batur and Choobineh (2009)
have recently introduced approaches for quantile-based system selection.

In  this  paper,  we  address  the  problem  of  identifying  the  populations  that  correspond  to  the  m 
smallest  quantiles  by  sampling  independently  from  d  populations.  By  using  the  large  deviations 
framework,  we  characterize  the  optimal  sampling  (or  budget  allocation)  scheme  that  minimizes  the 
probability  of  incorrect  selection  given  a  fixed  sampling  budget.  The  remainder  of  the  paper  is 
organized  as  follows:  in  the  next  section,  we  formally  define  the  problem.  We  then  characterize  the 
budget  allocation  scheme.  As  this  characterization  leads  to  a  difficult,  nested  optimization  problem, 
we  turn  our  focus  to  a  special  case,  where  we  wish  to  identify  those  populations  whose  quantiles 
exceed  a  threshold  value.  We  conclude  the  paper  with  a  number  of  simple  illustrations. 

2 PROBLEM DEFINITION

Suppose we have $d$ populations from which we can independently sample. Let $X_i$ be a random variable
sampled from population $i$ with distribution function $F_i(\cdot)$. Let $q_i$ be the $\alpha_i$-quantile
of population $i$; that is,

$$q_i = \inf\{x : F_i(x) \ge \alpha_i\}.$$

Throughout we assume that $(F_i(\cdot) : i = 1, \ldots, d)$ and $(q_i : i = 1, \ldots, d)$ are unknown, and that
$0 < \alpha_i < 1$. The goal is to determine the populations that correspond to the $m$ smallest quantiles,
where the $m$'th smallest quantile is different from the $(m+1)$'st smallest quantile. Hence, without loss
of generality, we suppose that

$$q_1 \le q_2 \le \cdots \le q_m < q_{m+1} \le \cdots \le q_d.$$

The simulation budget is $n$, $p = (p_1, \ldots, p_d)$ is the vector of fractional allocations, and $n_i = [n p_i]$ is
the sample size of population $i$. Let $(X_{i,k} : k = 1, \ldots, n_i)$ be a collection of IID random samples drawn
from $F_i$, and $X_{i,1:n_i} \le \cdots \le X_{i,n_i:n_i}$ the ordered samples of population $i$. The $\alpha_i$-quantile
estimator is $X_{i,\lceil \alpha_i n_i \rceil : n_i}$, where $\lceil \cdot \rceil$ is the ceiling operator.
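As a minimal sketch of the estimator just defined, the following Python snippet returns the order statistic of rank $\lceil \alpha n \rceil$ from a sample. The names `quantile_estimate` and `sample_sizes` are illustrative and do not come from the paper, and rounding $n p_i$ down is an assumption about the bracket notation for the sample sizes.

```python
import math


def quantile_estimate(samples, alpha):
    """Return the ceil(alpha * n)-th smallest value of `samples` (1-indexed rank)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(samples)
    n = len(ordered)
    k = math.ceil(alpha * n)  # rank of the order statistic, 1 <= k <= n
    return ordered[k - 1]


def sample_sizes(n, p):
    """Per-population sample sizes from budget n and fractions p.

    Assumption: [n * p_i] is taken to mean rounding down.
    """
    return [math.floor(n * p_i) for p_i in p]
```

For instance, `quantile_estimate([5, 1, 4, 2, 3], 0.5)` picks the third smallest of five observations.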

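To simplify the notation, define the sets $\mathcal{A} = \{1, \ldots, m\}$ and $\mathcal{B} = \{m+1, \ldots, d\}$. An incorrect
selection (IS) occurs when $\max_{i \in \mathcal{A}} X_{i,\lceil \alpha_i n_i \rceil : n_i} \ge \min_{j \in \mathcal{B}} X_{j,\lceil \alpha_j n_j \rceil : n_j}$. A lower bound for $P(IS)$ is

$$\max_{i \in \mathcal{A},\, j \in \mathcal{B}} P\left(X_{i,\lceil \alpha_i n_i \rceil : n_i} \ge X_{j,\lceil \alpha_j n_j \rceil : n_j}\right) \le P(IS),$$

and an upper bound for $P(IS)$ is

$$P(IS) = P\left(\bigcup_{i \in \mathcal{A}} \bigcup_{j \in \mathcal{B}} \left\{X_{i,\lceil \alpha_i n_i \rceil : n_i} \ge X_{j,\lceil \alpha_j n_j \rceil : n_j}\right\}\right) \le |\mathcal{A}| \times |\mathcal{B}| \max_{i \in \mathcal{A},\, j \in \mathcal{B}} P\left(X_{i,\lceil \alpha_i n_i \rceil : n_i} \ge X_{j,\lceil \alpha_j n_j \rceil : n_j}\right).$$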


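Hence, if

$$\frac{1}{n} \log P\left(X_{i,\lceil \alpha_i n_i \rceil : n_i} \ge X_{j,\lceil \alpha_j n_j \rceil : n_j}\right) \to -G_{ij}(p_i, p_j)$$

as $n \to \infty$ for some rate function $G_{ij}$, we have that

$$\frac{1}{n} \log P(IS) \to -\min_{i \in \mathcal{A},\, j \in \mathcal{B}} G_{ij}(p_i, p_j)$$

as $n \to \infty$. The rate functions $G_{ij}(p_i, p_j)$ depend on the large deviations of $X_{i,\lceil \alpha_i n_i \rceil : n_i}$, which are
treated next.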
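The two bounds on the incorrect selection probability can be illustrated by simulation. The sketch below is not part of the paper: the function `estimate_is_probabilities`, the exponential samplers in the usage note, and the replication count are all assumptions chosen for illustration. It estimates the probability of incorrect selection and the pairwise probabilities from the same replications, so the empirical frequencies satisfy the sandwich inequality exactly.

```python
import math
import random


def quantile_estimate(samples, alpha):
    """Order statistic of rank ceil(alpha * n)."""
    ordered = sorted(samples)
    return ordered[math.ceil(alpha * len(ordered)) - 1]


def estimate_is_probabilities(samplers, alphas, sizes, m, reps, seed=0):
    """Empirical P(IS) and pairwise frequencies of {estimate_i >= estimate_j}.

    Populations 0..m-1 play the role of the set A and m..d-1 the set B
    (0-indexed). `samplers[i](rng)` must return one draw from population i.
    """
    rng = random.Random(seed)
    d = len(samplers)
    pair_counts = {(i, j): 0 for i in range(m) for j in range(m, d)}
    is_count = 0
    for _ in range(reps):
        est = [
            quantile_estimate([samplers[i](rng) for _ in range(sizes[i])], alphas[i])
            for i in range(d)
        ]
        wrong = False
        for (i, j) in pair_counts:
            if est[i] >= est[j]:
                pair_counts[(i, j)] += 1
                wrong = True  # IS is the union of the pairwise events
        is_count += wrong
    p_is = is_count / reps
    p_pairs = {pair: count / reps for pair, count in pair_counts.items()}
    return p_is, p_pairs
```

A possible call uses three exponential populations with rates 1.0, 0.8, and 0.5 (so the first has the smallest median) via `samplers = [lambda r: r.expovariate(1.0), lambda r: r.expovariate(0.8), lambda r: r.expovariate(0.5)]` and `m=1`; increasing `sizes` should drive all estimated probabilities toward zero.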

In  preparation, 

3 CONTINUOUS CASE

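If $X_i$ has density $f_i(\cdot)$, then it can be shown (Serfling (1980), p. 85) that $X_{i,\lceil \alpha_i n_i \rceil : n_i}$ has the density

$$f_{i,n_i}(t) = n_i \binom{n_i - 1}{\lceil \alpha_i n_i \rceil - 1} [F_i(t)]^{\lceil \alpha_i n_i \rceil - 1} [1 - F_i(t)]^{n_i - \lceil \alpha_i n_i \rceil} f_i(t).$$

For $-\infty < \theta < \infty$ and $t$ in the support of $X_i$ define

$$g_i(t) = \theta t + \alpha_i \log\left(\frac{F_i(t)}{\alpha_i}\right) + (1 - \alpha_i) \log\left(\frac{1 - F_i(t)}{1 - \alpha_i}\right) \tag{1}$$

and

$$\Lambda_{i,n_i}(\theta) = \log E \exp\left(\theta X_{i,\lceil \alpha_i n_i \rceil : n_i}\right).$$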

When $g_i(\cdot)$ is strictly concave and twice differentiable, it has a unique global maximizer $\tau_i(\theta)$ satisfying
$g_i'(\tau_i(\theta)) = 0$. Observe that if $0 < \alpha_i < 1$ then $0 < F_i(\tau_i(\theta)) < 1$, for otherwise $g_i(\tau_i(\theta)) = -\infty$ and we
know that $g_i(q_i) = \theta q_i$ is feasible and greater than $-\infty$. Let

$$\Lambda_i(\theta) = g_i(\tau_i(\theta)). \tag{2}$$
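To make the function in (1) and its maximizer concrete, here is a small numerical sketch. Everything in it is an illustrative assumption rather than the paper's method: the population is taken to be exponential with rate 1, the helper names `make_g` and `argmax_concave` are made up, and the maximizer is found by ternary search, which relies on strict concavity. For this particular population, setting the derivative of (1) to zero gives the closed form $\log(1 + \alpha/(1 - \alpha - \theta))$ for $\theta < 1 - \alpha$, which the search should reproduce.

```python
import math


def make_g(cdf, alpha, theta):
    """Return t -> g(t) as in Eq. (1) for a given CDF, quantile level, and theta."""
    def g(t):
        F = cdf(t)
        if not 0.0 < F < 1.0:
            return -math.inf  # g is minus infinity where F(t) is 0 or 1
        return (theta * t
                + alpha * math.log(F / alpha)
                + (1.0 - alpha) * math.log((1.0 - F) / (1.0 - alpha)))
    return g


def argmax_concave(g, lo, hi, iters=200):
    """Ternary search for the maximizer of a strictly concave function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if g(a) < g(b):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    exp_cdf = lambda t: -math.expm1(-t)  # CDF of an exponential with rate 1
    alpha, theta = 0.5, 0.2
    g = make_g(exp_cdf, alpha, theta)
    tau = argmax_concave(g, 1e-6, 30.0)
    print(tau, g(tau))  # maximizer and the value of the limit in Eq. (2)
```

With $\theta = 0$ the maximizer coincides with the quantile itself and the maximum value is zero, as expected of a limiting log moment generating function evaluated at the origin.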

Proposition 1. If population $i$ has a density $f_i(t)$ with bounded first derivative and the function $g_i(\cdot)$
is twice differentiable with $\sup_t g_i''(t) < 0$, then

$$\lim_{n_i \to \infty} \frac{1}{n_i} \Lambda_{i,n_i}(n_i \theta) = \Lambda_i(\theta).$$
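Proof. From the definition of $\Lambda_{i,n_i}$ we have

$$\begin{aligned}
\Lambda_{i,n_i}(n_i \theta) &= \log\left(n_i \binom{n_i - 1}{\lceil \alpha_i n_i \rceil - 1}\right) + \log \int \exp(n_i \theta t)\, [F_i(t)]^{\lceil \alpha_i n_i \rceil - 1} [1 - F_i(t)]^{n_i - \lceil \alpha_i n_i \rceil} f_i(t)\, dt \\
&= \log\left(n_i \binom{n_i - 1}{\lceil \alpha_i n_i \rceil - 1}\right) + \log \int \exp\left(n_i \left(g_i(t) + \alpha_i \log(\alpha_i) + (1 - \alpha_i) \log(1 - \alpha_i)\right)\right) R_i(t) f_i(t)\, dt,
\end{aligned} \tag{3}$$

where $R_i(t) = [F_i(t)]^{\lceil \alpha_i n_i \rceil - \alpha_i n_i - 1} [1 - F_i(t)]^{\alpha_i n_i - \lceil \alpha_i n_i \rceil}$. Since $g_i(t)$ is twice differentiable everywhere,
Taylor's Theorem (see Serfling (1980)) for $g_i(t)$ around its global maximizer $\tau_i(\theta)$ yields

$$g_i(t) = g_i(\tau_i(\theta)) + \frac{1}{2} (t - \tau_i(\theta))^2 g_i''(\xi), \tag{4}$$

where $\xi$ lies between $\tau_i(\theta)$ and $t$. Plugging (4) in (3) and dividing through by $n_i$, we have

$$\begin{aligned}
\frac{1}{n_i} \Lambda_{i,n_i}(n_i \theta) ={}& \frac{1}{n_i} \log\left(n_i \binom{n_i - 1}{\lceil \alpha_i n_i \rceil - 1}\right) + g_i(\tau_i(\theta)) + \alpha_i \log(\alpha_i) + (1 - \alpha_i) \log(1 - \alpha_i) \\
&+ \frac{1}{n_i} \log \int \exp\left(\frac{n_i}{2} (t - \tau_i(\theta))^2 g_i''(\xi)\right) R_i(t) f_i(t)\, dt.
\end{aligned} \tag{5}$$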


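The binomial term on the right-hand side of (5) becomes

$$n_i \binom{n_i - 1}{\lceil \alpha_i n_i \rceil - 1} = \frac{n_i!}{(n_i - \lceil \alpha_i n_i \rceil)!\, (\lceil \alpha_i n_i \rceil - 1)!},$$

and Stirling's formula leads to

$$\lim_{n_i \to \infty} \frac{1}{n_i} \log \frac{n_i!}{(n_i - \lceil \alpha_i n_i \rceil)!\, (\lceil \alpha_i n_i \rceil - 1)!} = -(1 - \alpha_i) \log(1 - \alpha_i) - \alpha_i \log(\alpha_i). \tag{6}$$

Changing variables yields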

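$$\frac{1}{n_i} \log \int \exp\left(\frac{n_i}{2} (t - \tau_i(\theta))^2 g_i''(\xi)\right) R_i(t) f_i(t)\, dt = \frac{1}{n_i} \log\left(n_i^{-1/2} \int \exp\left(\frac{t^2}{2} g_i''(\xi)\right) R_i(\tau_i(\theta) + t n_i^{-1/2})\, f_i(\tau_i(\theta) + t n_i^{-1/2})\, dt\right).$$

Expanding $R_i(\cdot)$ and $f_i(\cdot)$ about $\tau_i(\theta)$ results in

$$R_i(\tau_i(\theta) + t n_i^{-1/2}) = R_i(\tau_i(\theta)) + \frac{t}{\sqrt{n_i}} R_i'(\xi_1) \tag{7}$$

and

$$f_i(\tau_i(\theta) + t n_i^{-1/2}) = f_i(\tau_i(\theta)) + \frac{t}{\sqrt{n_i}} f_i'(\xi_2),$$

for $\xi_1$ and $\xi_2$ between $\tau_i(\theta)$ and $\tau_i(\theta) + t n_i^{-1/2}$.

We saw earlier that $0 < F_i(\tau_i(\theta)) < 1$, which results in $R_i(\tau_i(\theta))$ and $R_i'(\tau_i(\theta))$ finite in (7). The
two assumptions then lead to

$$\begin{aligned}
&\frac{1}{n_i} \log\left(n_i^{-1/2} \int \exp\left(\frac{t^2}{2} g_i''(\xi)\right) R_i(\tau_i(\theta) + t n_i^{-1/2})\, f_i(\tau_i(\theta) + t n_i^{-1/2})\, dt\right) \\
&\qquad = \frac{1}{n_i} \log\left(n_i^{-1/2} \int \exp\left(\frac{t^2}{2} g_i''(\xi)\right) R_i(\tau_i(\theta))\, f_i(\tau_i(\theta))\, dt\right) + o(1) \to 0
\end{aligned} \tag{8}$$

as $n_i \to \infty$.

We conclude from (5), (6), and (8) that $\lim_{n_i \to \infty} n_i^{-1} \Lambda_{i,n_i}(n_i \theta) = \Lambda_i(\theta)$. □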
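The Stirling limit (6) used in the proof is easy to check numerically. The sketch below is an illustration only; the function names are assumptions, and `math.lgamma` is used for the log-factorials so that large sample sizes do not overflow.

```python
import math


def normalized_log_coefficient(n, alpha):
    """(1/n) * log( n! / ((n - k)! * (k - 1)!) ) with k = ceil(alpha * n)."""
    k = math.ceil(alpha * n)
    # lgamma(x + 1) = log(x!), hence lgamma(k) = log((k - 1)!)
    log_coef = math.lgamma(n + 1) - math.lgamma(n - k + 1) - math.lgamma(k)
    return log_coef / n


def stirling_limit(alpha):
    """Right-hand side of Eq. (6)."""
    return -(1.0 - alpha) * math.log(1.0 - alpha) - alpha * math.log(alpha)


if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        print(n, normalized_log_coefficient(n, 0.3), stirling_limit(0.3))
```

The gap between the two columns shrinks on the order of $\log(n)/n$, which is why the binomial term contributes only through its exponential rate.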

4 DISCRETE CASE

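Towards stating an analogous result for the discrete case, let us use a narrower definition of a quantile.
Let $q_i$ be the $\alpha_i$-quantile of population $i$, meaning that

$$F_i(q_i) = \alpha_i.$$

Suppose $X_i$ is supported on the countable set $\mathcal{S}$. Then (ignoring issues due to non-integral $n_i \alpha_i$),
it is seen that $X_{i,\lceil \alpha_i n_i \rceil : n_i}$ has the probability mass function

$$\Pr\{X_{i,\lceil \alpha_i n_i \rceil : n_i} = t\} = \binom{n_i}{\lceil \alpha_i n_i \rceil} \left([F_i(t)]^{\lceil \alpha_i n_i \rceil} - [F_i(t^-)]^{\lceil \alpha_i n_i \rceil}\right) [1 - F_i(t)]^{n_i - \lceil \alpha_i n_i \rceil}, \quad t \in \mathcal{S},$$

where $F_i(t^-) = \Pr\{X_i < t\}$.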

Before  we  state  the  main  result  for  the  discrete  context,  we  note  the  following  simple  proposition 
without  proof. 

Proposition 2. Let $\{a_{j,n}\}$ be a finite number of positive-valued sequences with

$$\lim_{n \to \infty} \frac{1}{n} \log a_{j,n} = a_j.$$

Then,

$$\lim_{n \to \infty} \frac{1}{n} \log \sum_j a_{j,n} = \max_j \{a_j\}.$$
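The proposition says that, on the exponential scale, a finite sum is governed by its largest term. A short numerical sketch, under the assumption that each sequence has the form $c_j \exp(n a_j)$ with illustrative constants $c_j$ (the function name is hypothetical), shows the normalized log-sum approaching the largest rate:

```python
import math


def normalized_log_sum(rates, prefactors, n):
    """(1/n) * log( sum_j c_j * exp(n * a_j) ), computed in a numerically stable way."""
    log_terms = [math.log(c) + n * a for a, c in zip(rates, prefactors)]
    top = max(log_terms)
    # log-sum-exp: factor out the largest term to avoid underflow or overflow
    log_total = top + math.log(sum(math.exp(v - top) for v in log_terms))
    return log_total / n


if __name__ == "__main__":
    rates = [-0.5, -0.2, -1.0]      # the limits a_j
    prefactors = [3.0, 0.5, 10.0]   # subexponential factors, irrelevant in the limit
    for n in (10, 100, 10_000):
        print(n, normalized_log_sum(rates, prefactors, n))
```

The prefactors shift the value for small `n` but wash out as `n` grows, leaving the maximum of the rates.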


We  are  now  ready  to  state  the  main  result  in  the  discrete  context. 


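Proposition 3. Suppose $X_i$ has finite support $\mathcal{S}$, and satisfies $\Pr\{X_i = t\} > 0$ for each $t \in \mathcal{S}$.
Furthermore, suppose that the function $g_i(\cdot)$ has a unique maximum at $\tau_i(\theta)$ and that $g_i(t)$ is strictly
increasing (decreasing) for $t < \tau_i(\theta)$ (resp., $t > \tau_i(\theta)$). Then

$$\lim_{n_i \to \infty} \frac{1}{n_i} \Lambda_{i,n_i}(n_i \theta) = \Lambda_i(\theta).$$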
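Proof. Denote $p_i(t) = \Pr\{X_{i,\lceil \alpha_i n_i \rceil : n_i} = t\}$, $t_m = \max\{t : t \in \mathcal{S}\}$, and $\mathcal{S}' = \mathcal{S} \setminus \{t_m\}$. (Since
$p_i(t_m) = 0$, we see that $g_i(t_m) = -\infty$ and hence $\tau_i(\theta) \ne t_m$.) We have

$$\lim_{n_i \to \infty} \frac{1}{n_i} \Lambda_{i,n_i}(n_i \theta) = \lim_{n_i \to \infty} \frac{1}{n_i} \log\left(\sum_{t \in \mathcal{S}'} p_i(t) \exp\{n_i \theta t\}\right). \tag{9}$$

We will show that

$$\lim_{n_i \to \infty} \frac{1}{n_i} \log\left(p_i(t) \exp\{n_i \theta t\}\right) = g_i(t) \quad \forall t \in \mathcal{S}'. \tag{10}$$

The assertion of the proposition then follows from applying Proposition 2 to (9) and (10).
To show (10), we notice that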


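$$\begin{aligned}
\frac{1}{n_i} \log\left(p_i(t) \exp\{n_i \theta t\}\right) &= \theta t + \frac{1}{n_i} \log p_i(t) \\
&= \theta t + \frac{1}{n_i} \log \binom{n_i}{\lceil \alpha_i n_i \rceil} + \frac{1}{n_i} \log\left(F_i(t)^{\lceil \alpha_i n_i \rceil} - F_i(t^-)^{\lceil \alpha_i n_i \rceil}\right) + \frac{n_i - \lceil \alpha_i n_i \rceil}{n_i} \log(1 - F_i(t)) \\
&= \theta t + \frac{1}{n_i} \log \binom{n_i}{\lceil \alpha_i n_i \rceil} + \frac{1}{n_i} \log F_i(t)^{\lceil \alpha_i n_i \rceil} + \frac{1}{n_i} \log\left(1 - \frac{F_i(t^-)^{\lceil \alpha_i n_i \rceil}}{F_i(t)^{\lceil \alpha_i n_i \rceil}}\right) + \frac{n_i - \lceil \alpha_i n_i \rceil}{n_i} \log(1 - F_i(t)).
\end{aligned} \tag{11}$$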

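Now, through an application of Stirling's formula we see that the second term appearing on the
right-hand side of (11) satisfies

$$\lim_{n_i \to \infty} \frac{1}{n_i} \log \binom{n_i}{\lceil \alpha_i n_i \rceil} = \lim_{n_i \to \infty} \frac{1}{n_i} \log \frac{n_i!}{(n_i - \lceil \alpha_i n_i \rceil)!\, \lceil \alpha_i n_i \rceil!} = -(1 - \alpha_i) \log(1 - \alpha_i) - \alpha_i \log(\alpha_i) = -H(\alpha_i). \tag{12}$$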


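Next, we see that since

$$1 - \frac{F_i(t^-)^{\lceil \alpha_i n_i \rceil}}{F_i(t)^{\lceil \alpha_i n_i \rceil}}$$

is arbitrarily close to 1 for large enough $n_i$, the fourth term appearing on the right-hand side of (11) satisfies

$$\lim_{n_i \to \infty} \frac{1}{n_i} \log \frac{F_i(t)^{\lceil \alpha_i n_i \rceil} - F_i(t^-)^{\lceil \alpha_i n_i \rceil}}{F_i(t)^{\lceil \alpha_i n_i \rceil}} = 0. \tag{13}$$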


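Finally, the third and fifth terms appearing on the right-hand side of (11) satisfy

$$\lim_{n_i \to \infty} \frac{1}{n_i} \log F_i(t)^{\lceil \alpha_i n_i \rceil} + \lim_{n_i \to \infty} \frac{n_i - \lceil \alpha_i n_i \rceil}{n_i} \log(1 - F_i(t)) = H(\alpha_i) + \alpha_i \log\left(\frac{F_i(t)}{\alpha_i}\right) + (1 - \alpha_i) \log\left(\frac{1 - F_i(t)}{1 - \alpha_i}\right). \tag{14}$$

Using (12), (13), and (14) in (11), we get

$$\lim_{n_i \to \infty} \frac{1}{n_i} \log\left(p_i(t) \exp\{n_i \theta t\}\right) = \theta t - H(\alpha_i) + H(\alpha_i) + \alpha_i \log\left(\frac{F_i(t)}{\alpha_i}\right) + (1 - \alpha_i) \log\left(\frac{1 - F_i(t)}{1 - \alpha_i}\right) = g_i(t),$$

and thus (10) holds. □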

5 QUANTILE SELECTION

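Propositions 1 and 3 can be used to obtain an expression for the exponential decay rate of the incorrect
selection probability, in terms of the sampling budget allocation. Let $I_k(x) = \sup_\theta \{\theta x - \Lambda_k(\theta)\}$ be the
rate function corresponding to population $k$. In the continuous setting, for $x$ such that $0 < F_k(x) < 1$,
Proposition 1 leads to

$$\begin{aligned}
\frac{d}{d\theta} \Lambda_k(\theta) &= \tau_k(\theta) + \tau_k'(\theta) \left(\theta + \alpha_k \frac{f_k(\tau_k(\theta))}{F_k(\tau_k(\theta))} - (1 - \alpha_k) \frac{f_k(\tau_k(\theta))}{1 - F_k(\tau_k(\theta))}\right) \\
&= \tau_k(\theta),
\end{aligned}$$

so that

$$I_k(x) = \alpha_k \log\left(\frac{\alpha_k}{F_k(x)}\right) + (1 - \alpha_k) \log\left(\frac{1 - \alpha_k}{1 - F_k(x)}\right). \tag{15}$$

A similar argument for the discrete case shows that Eq. (15) is valid there as well.

Let $Z_n = (X_{i,\lceil \alpha_i n_i \rceil : n_i}, X_{j,\lceil \alpha_j n_j \rceil : n_j})$. Then, as shown in Glynn and Juneja (2004), the rate function
of $(Z_n : n > 0)$ is given by $p_i I_i(x_i) + p_j I_j(x_j)$, and applying the Gärtner-Ellis Theorem results in

$$G_{ij}(p_i, p_j) = \inf_{x_i \ge x_j} \{p_i I_i(x_i) + p_j I_j(x_j)\}.$$

If $F_i(q_m) < 1$, $\forall i \in \mathcal{A}$ and $F_j(q_1) > 0$, $\forall j \in \mathcal{B}$, then the rate functions $G_{ij}(p_i, p_j)$ are finite for any
feasible allocation $p_i, p_j$. Furthermore, since $I_k(x)$ is strictly decreasing for $x < q_k$ and strictly increasing
for $x > q_k$, we must have $G_{ij}(p_i, p_j) = \inf_x \{p_i I_i(x) + p_j I_j(x)\}$.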
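The rate function in (15) is the Kullback-Leibler divergence between a Bernoulli variable with success probability equal to the quantile level and one with success probability $F_k(x)$, so it is cheap to evaluate. The sketch below computes it and then approximates the pairwise rate by a one-dimensional search. The helper names, the exponential populations in the usage note, and the use of ternary search are illustrative assumptions; in particular, ternary search presumes the weighted sum of rate functions is unimodal on the bracket, which holds for the exponential example because each rate function is convex there.

```python
import math


def rate_function(cdf, alpha):
    """Return x -> I(x) as in Eq. (15) for a given CDF and quantile level."""
    def rate(x):
        F = cdf(x)
        if not 0.0 < F < 1.0:
            return math.inf
        return (alpha * math.log(alpha / F)
                + (1.0 - alpha) * math.log((1.0 - alpha) / (1.0 - F)))
    return rate


def pairwise_rate(rate_i, rate_j, p_i, p_j, lo, hi, iters=200):
    """Approximate inf over x in [lo, hi] of p_i * I_i(x) + p_j * I_j(x).

    Assumes the objective is unimodal on [lo, hi]; a natural bracket is
    [q_i, q_j], the quantiles of the two populations.
    """
    def objective(x):
        return p_i * rate_i(x) + p_j * rate_j(x)

    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) > objective(b):
            lo = a
        else:
            hi = b
    x_star = 0.5 * (lo + hi)
    return objective(x_star), x_star
```

For two exponential populations with rates 1.0 and 0.5 and both quantile levels equal to 0.5, the bracket is $[\log 2, 2\log 2]$, and the minimizing point lies strictly inside it because one rate function is increasing and the other decreasing there.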

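An optimal allocation $p^*$ maximizes $\min_{i \in \mathcal{A},\, j \in \mathcal{B}} G_{ij}(p_i, p_j)$, which is the same as

$$\begin{aligned}
\max \quad & z \\
\text{s.t.} \quad & z - G_{ij}(p_i, p_j) \le 0, \quad \forall i \in \mathcal{A} \text{ and } \forall j \in \mathcal{B}, \\
\text{and} \quad & \sum_{i=1}^{d} p_i \le 1, \quad p_i \ge 0.
\end{aligned}$$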


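The first-order conditions are necessary for optimality (same argument as in Glynn and Juneja (2004)).
They are

$$\sum_{j \in \mathcal{B}} \lambda_{ij} \frac{\partial G_{ij}(p_i^*, p_j^*)}{\partial p_i} = \gamma \quad \forall i \in \mathcal{A},$$

$$\sum_{i \in \mathcal{A}} \lambda_{ij} \frac{\partial G_{ij}(p_i^*, p_j^*)}{\partial p_j} = \gamma \quad \forall j \in \mathcal{B},$$

$$\sum_{i \in \mathcal{A}} \sum_{j \in \mathcal{B}} \lambda_{ij} = 1,$$

and

$$\lambda_{ij} \left(z - G_{ij}(p_i^*, p_j^*)\right) = 0 \quad \forall i \in \mathcal{A},\ \forall j \in \mathcal{B},$$

where $\lambda_{ij} \ge 0$ for all $i \in \mathcal{A}$, $j \in \mathcal{B}$ and $\gamma \ge 0$.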

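It can be shown that

$$\min_{j \in \mathcal{B}} G_{ij}(p_i^*, p_j^*) = z^* \quad \forall i \in \mathcal{A},$$

and that

$$\min_{i \in \mathcal{A}} G_{ij}(p_i^*, p_j^*) = z^* \quad \forall j \in \mathcal{B},$$

for some $z^* > 0$.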

5.1 Crossing a Threshold


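Getting insights about the optimal allocation appears very difficult because we have a nested
optimization problem. It is easier, however, to characterize the allocation that minimizes the probability of
crossing a threshold $c \in [q_m, q_{m+1}]$. Let $IS_c$ be the event
$\left(\bigcup_{i \in \mathcal{A}} \{X_{i,\lceil \alpha_i n_i \rceil : n_i} > c\}\right) \cup \left(\bigcup_{j \in \mathcal{B}} \{X_{j,\lceil \alpha_j n_j \rceil : n_j} < c\}\right)$.

Then we have

$$P(IS) \le P(IS_c) \le \sum_{i \in \mathcal{A}} P\left(X_{i,\lceil \alpha_i n_i \rceil : n_i} > c\right) + \sum_{j \in \mathcal{B}} P\left(X_{j,\lceil \alpha_j n_j \rceil : n_j} < c\right).$$

Using an argument similar to the one presented in Szechtman and Yücesan (2008), we get

$$\frac{1}{n} \log\left(\sum_{i \in \mathcal{A}} P\left(X_{i,\lceil \alpha_i n_i \rceil : n_i} > c\right) + \sum_{j \in \mathcal{B}} P\left(X_{j,\lceil \alpha_j n_j \rceil : n_j} < c\right)\right) \to -\min\{p_1 I_1(c), \ldots, p_d I_d(c)\}$$

as $n \to \infty$. Following Szechtman and Yücesan (2008), the optimal allocations are

$$p_k^* = \frac{I_k^{-1}(c)}{\sum_{i \in \mathcal{A}} I_i^{-1}(c) + \sum_{j \in \mathcal{B}} I_j^{-1}(c)},$$

where $I_k^{-1}(c)$ denotes the reciprocal $1 / I_k(c)$, leading to

$$\limsup_{n \to \infty} \frac{1}{n} \log P(IS) \le -\left(\sum_{i \in \mathcal{A}} I_i^{-1}(c) + \sum_{j \in \mathcal{B}} I_j^{-1}(c)\right)^{-1}.$$

That is, the optimal threshold is the one that minimizes $\sum_{i \in \mathcal{A}} I_i^{-1}(c) + \sum_{j \in \mathcal{B}} I_j^{-1}(c)$ over $c \in [q_m, q_{m+1}]$.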
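The threshold-crossing allocation is simple enough to compute directly. The following sketch is illustrative rather than the authors' implementation: the function names, the grid search over the threshold, and the exponential populations in the usage note are assumptions. It evaluates the rate function of (15) at a threshold, forms the allocation proportional to the reciprocal rates, and searches the open interval between the two critical quantiles for the threshold that minimizes the sum of reciprocals.

```python
import math


def rate_at(cdf, alpha, c):
    """I_k(c) from Eq. (15); assumes 0 < cdf(c) < 1."""
    F = cdf(c)
    return (alpha * math.log(alpha / F)
            + (1.0 - alpha) * math.log((1.0 - alpha) / (1.0 - F)))


def threshold_allocation(rates):
    """Allocation proportional to 1 / I_k(c), and the resulting decay rate bound."""
    inverse = [1.0 / r for r in rates]
    total = sum(inverse)
    return [v / total for v in inverse], 1.0 / total


def best_threshold(cdfs, alphas, q_m, q_m1, grid=999):
    """Grid search over the open interval (q_m, q_m1) for the c minimizing sum_k 1 / I_k(c).

    The endpoints are excluded because a rate function vanishes at its own quantile.
    """
    best = None
    for s in range(1, grid + 1):
        c = q_m + (q_m1 - q_m) * s / (grid + 1)
        cost = sum(1.0 / rate_at(F, a, c) for F, a in zip(cdfs, alphas))
        if best is None or cost < best[1]:
            best = (c, cost)
    return best
```

With exponential populations of rates 1.0, 0.8, 0.5, and 0.4, all at level 0.5 and with the two smallest medians to be selected, the allocation at any interior threshold equalizes the products $p_k I_k(c)$ across populations, which is the balance condition behind the formula for the optimal allocation.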

6 CONCLUDING REMARKS

In  this  paper,  we  addressed  the  problem  of  identifying  the  populations  that  correspond  to  the  m  smallest 
quantiles  by  sampling  independently  from  d  populations.  Using  a  large  deviations  framework,  we 
characterized  the  optimal  sampling  (or  budget  allocation)  scheme  that  minimizes  the  probability  of 
incorrect  selection  given  a  sampling  budget  that  grows  to  infinity.  In  particular,  the  optimal  budget 
allocation  arises  as  the  solution  of  a  3-layer  nested  optimization  problem.  The  threshold  crossing 
problem, where we wish to identify those populations whose quantiles exceed a threshold value, leads
to  more  tractable  budget  allocations. 